27 research outputs found

    Pattern Classification by an Incremental Learning Fuzzy Neural Network

    To detect and identify defects in machine condition health monitoring, classical neural classifiers, such as Multilayer Perceptron (MLP) neural networks, are proposed to supervise the monitored system. A drawback of classical neural classifiers, which use off-line and iterative learning algorithms, is a long training time. In addition, they are often stuck at local minima, unable to achieve the optimum solution. Furthermore, in an operating mode, it is possible that new faults are developing while a monitored system is running. These new classes of defects need to be instantly detected and distinguished from those that the classifier has been trained on. Classical neural classifiers need to be retrained with both old and new patterns in order to learn new patterns without forgetting the learned patterns. Conventional classifiers cannot detect and learn the new fault types on-line in real time. Using incremental learning algorithms in the monitoring system, it is possible to detect those new defects of machine conditions with the system operating while maintaining old knowledge. Inspired by the promising properties of an incremental learning algorithm named Fuzzy ARTMAP Neural Network, a new algorithm suitable for pattern classification based on fuzzy neural networks, called an Incremental Learning Fuzzy Neural Network (ILFN), is developed. The ILFN uses Gaussian neurons to represent the distributions of the input space, while the fuzzy ARTMAP neural network uses hyperboxes. The ILFN employs a hybrid supervised and unsupervised learning scheme to generate its prototypes. The network is a self-organized classifier with the capability of adaptive learning of new information without forgetting old knowledge. The classifier can detect new classes of patterns and update its parameters while in an operating mode. Moreover, it is an on-line (real-time) and fast learning algorithm that requires no a priori information.
In addition, it has the capability to make soft (fuzzy) and hard (crisp) decisions, and it is able to classify both linearly separable and nonlinearly separable problems. To prove the concept, simulations have been performed with the vibration data known as the Westland Data Set. This data set, collected from U.S. Navy CH-46E helicopters, was obtained from the Internet at http://wisdom.arl.psu.edu/Westland/, maintained by the Applied Research Laboratory (ARL) at Penn State University. Using a simple Fast Fourier Transform (FFT) technique for the feature extraction part, the network, capable of one-pass, on-line, and incremental learning, performed quite well. Trained at various torque levels, the network achieved 100% correct prediction on testing data at the same torque level. Furthermore, the classification performance of the network has been tested using other benchmark data, such as Fisher's Iris data, the two-spiral problem, and a vowel data set. Comparison studies with other well-known classifiers were performed. The ILFN was found competitive with or even superior to many classifiers.
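
The idea of Gaussian prototypes that are created on-line, with both a soft and a hard decision, can be sketched as follows. This is only a toy illustration and not the ILFN algorithm from the abstract: the class name, the fixed shared `sigma`, the `threshold` rule for creating a new prototype, and the running-mean update are all assumptions made for the example.

```python
import math


class GaussianPrototypeClassifier:
    """Toy one-pass classifier with Gaussian prototypes (not the ILFN itself)."""

    def __init__(self, sigma=1.0, threshold=0.5):
        self.sigma = sigma          # assumed fixed spread shared by all prototypes
        self.threshold = threshold  # assumed minimum membership to reuse a prototype
        self.prototypes = []        # each entry is [center, label, sample_count]

    def membership(self, x, center):
        # Gaussian membership in (0, 1]; exactly 1.0 when x equals the center.
        d2 = sum((a - b) ** 2 for a, b in zip(x, center))
        return math.exp(-d2 / (2.0 * self.sigma ** 2))

    def _best(self, x, label=None):
        # Best-matching prototype, optionally restricted to one class label.
        best, best_m = None, 0.0
        for p in self.prototypes:
            if label is not None and p[1] != label:
                continue
            m = self.membership(x, p[0])
            if m > best_m:
                best, best_m = p, m
        return best, best_m

    def learn(self, x, label):
        # One-pass update: refine a matching prototype or create a new one.
        p, m = self._best(x, label)
        if p is None or m < self.threshold:
            self.prototypes.append([list(x), label, 1])
        else:
            p[2] += 1
            p[0] = [c + (a - c) / p[2] for c, a in zip(p[0], x)]  # running mean

    def predict(self, x):
        # Hard decision (label, or None for "unknown") plus soft membership.
        p, m = self._best(x)
        if p is None or m < self.threshold:
            return None, m
        return p[1], m


clf = GaussianPrototypeClassifier(sigma=1.0, threshold=0.5)
clf.learn([0.0, 0.0], "normal")
clf.learn([0.2, 0.0], "normal")    # close enough: updates the existing prototype
clf.learn([5.0, 5.0], "fault_a")   # new class: stored without retraining old data
label, soft = clf.predict([0.1, 0.1])
unknown_label, unknown_soft = clf.predict([10.0, -10.0])
```

Returning `None` for a pattern that matches no prototype is one simple way to flag a possible new class; the actual detection rule used by the ILFN is not given in the abstract.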

    The Evaluated Measurement of a Combined Genetic Algorithm and Artificial Immune System

    This paper demonstrates a hybrid of two optimization methods, the Artificial Immune System (AIS) and the Genetic Algorithm (GA). The novel algorithm, called the immune genetic algorithm (IGA), improves on the results that GA and AIS achieve when working separately, which is the main objective of this hybrid. Negative selection, which is one of the techniques in the AIS, was employed to determine the input variables (populations) of the system. To illustrate the effectiveness of the IGA, comparisons with a steady-state GA, AIS, and PSO were also investigated. Performance was tested on mathematical test problems, divided into single-objective and multiobjective cases. Five single-objective functions were used to test the modified algorithm; the results showed that IGA performed better than all of the other methods. The DTLZ multiobjective test functions were then used. The results again showed that the modified approach had the best performance.

    A critical assessment of imbalanced class distribution problem: the case of predicting freshmen student attrition

    Predicting student attrition is an intriguing yet challenging problem for any academic institution. Class-imbalanced data is common in the field of student retention, mainly because a lot of students register but fewer students drop out. Classification techniques for imbalanced datasets can yield deceptively high prediction accuracy, where the overall predictive accuracy is usually driven by the majority class at the expense of very poor performance on the crucial minority class. In this study, we compared different data balancing techniques to improve the predictive accuracy in the minority class while maintaining satisfactory overall classification performance. Specifically, we tested three balancing techniques: oversampling, under-sampling, and the synthetic minority over-sampling technique (SMOTE), along with four popular classification methods: logistic regression, decision trees, neural networks, and support vector machines. We used a large and feature-rich institutional student data set (between the years 2005 and 2011) to assess the efficacy of both the balancing techniques and the prediction methods. The results indicated that the support vector machine combined with the SMOTE data-balancing technique achieved the best classification performance, with a 90.24% overall accuracy on the 10-fold holdout sample. All three data-balancing techniques improved the prediction accuracy for the minority class. Applying sensitivity analyses to the developed models, we also identified the most important variables for accurate prediction of student attrition. Application of these models has the potential to accurately predict at-risk students and help reduce student dropout rates.
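
The core step of SMOTE, interpolating between a minority sample and one of its nearest minority neighbours, can be sketched as below. This is a minimal illustration under assumptions, not the study's pipeline: the function name `smote`, the defaults `k=3` and `seed=0`, and the toy points are all made up for the example, and the input must contain at least two minority samples.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()                  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Hypothetical minority-class feature vectors (for illustration only).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies, unlike plain oversampling, which only duplicates existing rows.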

    Incremental learning algorithm based on support vector machine with Mahalanobis distance (ISVMM) for intrusion prevention

    In this paper we propose a new classifier called an incremental learning algorithm based on support vector machine with Mahalanobis distance (ISVMM). The features of this new algorithm are: prediction of the incoming data type by supervised learning with a support vector machine (SVM); fewer calculation steps and lower algorithmic complexity through finding a support set, an error set, and a remaining set; provision of hard and soft decisions; time saved on repeatedly training on the datasets by applying incremental learning; a new approach for building an ellipsoidal kernel for multidimensional data, instead of a spherical kernel, by using the Mahalanobis distance; and a way of handling the covariance matrix to avoid division by zero. To evaluate the classification performance of the algorithm, it was applied to intrusion prevention using the data from the Third International Knowledge Discovery and Data Mining Tools Competition (KDDcup'99). According to the experimental results, ISVMM predicts well using all 41 features of the incoming datasets, without reducing their dimensionality, and it can compete with a similar algorithm that uses a Euclidean distance in the kernel.
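
How a Mahalanobis distance turns a spherical kernel into an ellipsoidal one can be sketched in two dimensions. This is an illustration under assumptions, not the ISVMM kernel: the function names, the `gamma` parameter, and in particular the small `ridge` added to the variances to keep the covariance matrix invertible are choices made for this example; the abstract does not say how the paper avoids division by zero.

```python
import math


def covariance_2d(points, ridge=1e-6):
    """Return (sxx, sxy, syy) of 2-D points, with an assumed ridge on the diagonal."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n + ridge
    syy = sum((p[1] - my) ** 2 for p in points) / n + ridge
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return sxx, sxy, syy


def mahalanobis_sq(x, y, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy  # positive when the ridge is positive
    dx, dy = x[0] - y[0], x[1] - y[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def mahalanobis_rbf(x, y, cov, gamma=0.5):
    """RBF-style kernel whose level sets are ellipses shaped by cov."""
    return math.exp(-gamma * mahalanobis_sq(x, y, cov))


# Variance 4 along the first axis and 1 along the second (illustrative values).
cov = (4.0, 0.0, 1.0)
d_wide = mahalanobis_sq([2.0, 0.0], [0.0, 0.0], cov)
d_narrow = mahalanobis_sq([0.0, 1.0], [0.0, 0.0], cov)

# Degenerate data: identical points would give a singular covariance without the ridge.
flat_cov = covariance_2d([[1.0, 1.0], [1.0, 1.0]])
k_same = mahalanobis_rbf([1.0, 1.0], [1.0, 1.0], flat_cov)
```

With this covariance, a point two units away along the high-variance axis and a point one unit away along the low-variance axis are at the same Mahalanobis distance, which is what makes the kernel's level sets ellipses rather than circles.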

    Effectiveness of Word Extraction and Information Retrieval on Cancer from Thai Website

    This article proposes word extraction and cancer information retrieval from Thai websites. For word extraction, TH-OnSeg is proposed as a word segmentation technique based on the LexTo algorithm with a cancer dictionary and cancer oncology. TH-OnSeg is used to extract cancer-related words to be used for indexing documents from cancer websites. The experiments compared its word extraction with that of the LexTo word segmentation algorithm based on a Thai electronic dictionary. The results show that the TH-OnSeg technique has higher efficiency; it can extract more words than LexTo for unknown words, known words, and ambiguous words. In addition, we propose a semantic web-based technique combined with n-grams for cancer information retrieval. The experiments compared the proposed technique with information retrieval methods in a database. The results show that the use of semantic web techniques combined with n-grams for cancer information retrieval yields the highest number of cancer websites. The highest recall is not less than 0.9 in all experimental cases of both misspellings and misspellings
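
Why n-grams help retrieval tolerate misspelled queries can be shown with a small character-bigram matcher. This is a generic sketch, not the article's method: the function names, the Dice coefficient, the `threshold` of 0.5, and the English index terms are assumptions chosen for illustration (the article works with Thai text and a semantic web component that is not modelled here).

```python
def char_ngrams(text, n=2):
    """Set of overlapping character n-grams of a lowercased string."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def dice(a, b, n=2):
    """Dice coefficient between the n-gram sets of two strings."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def search(query, index_terms, threshold=0.5):
    """Index terms whose n-gram similarity to the query meets the threshold."""
    scored = [(dice(query, term), term) for term in index_terms]
    return [term for score, term in sorted(scored, reverse=True) if score >= threshold]


# Hypothetical index terms; the query contains a transposition typo.
index_terms = ["cancer", "tumor", "chemotherapy"]
hits = search("cancre", index_terms)
typo_score = dice("cancre", "cancer")
```

An exact-match lookup would return nothing for the misspelled query, while the bigram overlap still ranks the intended term first.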